Goto

Collaborating Authors

 finite sample


Finite Sample Analysis of the GTD Policy Evaluation Algorithms in Markov Setting

Neural Information Processing Systems

In reinforcement learning (RL), one of the key components is policy evaluation, which aims to estimate the value function (i.e., expected long-term accumulated reward) of a policy. With a good policy evaluation method, the RL algorithms will estimate the value function more accurately and find a better policy. When the state space is large or continuous \emph{Gradient-based Temporal Difference(GTD)} policy evaluation algorithms with linear function approximation are widely used. Considering that the collection of the evaluation data is both time and reward consuming, a clear understanding of the finite sample performance of the policy evaluation algorithms is very important to reinforcement learning. Under the assumption that data are i.i.d.





To all reviewers, thank you very much for your thoughtful comments and suggestions

Neural Information Processing Systems

To all reviewers, thank you very much for your thoughtful comments and suggestions. R#1: "...importance of similarity among the selected tasks... " R#1: "...domain randomization, when enough samples are used, is a better alternative to meta-learning... " R#2: "...Theorems 1 and 2 are asymptotic... " Hence, the theorems are NOT asymptotic. We will remove the asymptotic parts for clarity. R#2: 'Assumption 2 ... the per-task optimal models are centered around the corresponding optimal solutions. This assumption can easily be dropped with the cost of including the distance as a term.



Learning with Incremental Iterative Regularization

Lorenzo Rosasco, Silvia Villa

Neural Information Processing Systems

Within a statistical learning setting, we propose and study an iterative regularization algorithm for least squares defined by an incremental gradient method. In particular, we show that, if all other parameters are fixed a priori, the number of passes over the data (epochs) acts as a regularization parameter, and prove strong universal consistency, i.e. almost sure convergence of the risk, as well as sharp finite sample bounds for the iterates. Our results are a step towards understanding the effect of multiple epochs in stochastic gradient techniques in machine learning and rely on integrating statistical and optimization results.



An Equal-Probability Partition of the Sample Space: A Non-parametric Inference from Finite Samples

Eriksson, Urban

arXiv.org Machine Learning

This paper investigates what can be inferred about an arbitrary continuous probability distribution from a finite sample of $N$ observations drawn from it. The central finding is that the $N$ sorted sample points partition the real line into $N+1$ segments, each carrying an expected probability mass of exactly $1/(N+1)$. This non-parametric result, which follows from fundamental properties of order statistics, holds regardless of the underlying distribution's shape. This equal-probability partition yields a discrete entropy of $\log_2(N+1)$ bits, which quantifies the information gained from the sample and contrasts with Shannon's results for continuous variables. I compare this partition-based framework to the conventional ECDF and discuss its implications for robust non-parametric inference, particularly in density and tail estimation.


Finite Samples for Shallow Neural Networks

Xia, Yu, Xu, Zhiqiang

arXiv.org Artificial Intelligence

This paper investigates the ability of finite samples to identify two-layer irreducible shallow networks with various nonlinear activation functions, including rectified linear units (ReLU) and analytic functions such as the logistic sigmoid and hyperbolic tangent. An ``irreducible" network is one whose function cannot be represented by another network with fewer neurons. For ReLU activation functions, we first establish necessary and sufficient conditions for determining the irreducibility of a network. Subsequently, we prove a negative result: finite samples are insufficient for definitive identification of any irreducible ReLU shallow network. Nevertheless, we demonstrate that for a given irreducible network, one can construct a finite set of sampling points that can distinguish it from other network with the same neuron count. Conversely, for logistic sigmoid and hyperbolic tangent activation functions, we provide a positive result. We construct finite samples that enable the recovery of two-layer irreducible shallow analytic networks. To the best of our knowledge, this is the first study to investigate the exact identification of two-layer irreducible networks using finite sample function values. Our findings provide insights into the comparative performance of networks with different activation functions under limited sampling conditions.